Configuring sar for your system

Before you can tune a system properly, you must decide which system characteristics are important and which are less so. Once you've set your priorities, you need a way to measure system performance against them. The system activity reporter programs are a good measuring tool for many aspects of system performance. In this article, we'll introduce you to the sar utility, which can give you detailed performance information about your system.

What does sar measure?

Since system tuning is the art of finding acceptable compromises, you need to be able to see the impact of your changes on multiple subsystems. The system activity reporter (sar) programs collect system-performance information in distinct groups. Table A shows how sar groups the performance information. The first column shows the switch you give to sar to request a particular information group, and the second column briefly describes that group.

Table A
Switch  Performance Monitoring Group
A       All monitoring groups
a       File access statistics
b       Buffer activity
c       System call activity
d       Block device activity
g       Paging out activity
k       Kernel memory allocation
m       Message and semaphores
p       Paging in activity
q       CPU Run queue statistics
r       Unused memory and disk pages
u       CPU usage statistics (default)
v       Report status of system tables
w       System swapping and switching
y       TTY device activity

One way you can run sar is to specify a sampling interval and the number of times you want it to run. So, if you want to check the file-access statistics every 20 seconds for the next five minutes, you'd run sar like this:

$ sar -a 20 15

SunOS Devo 5.5.1 Generic_103641-08 i86pc    11/05/97

01:06:02  iget/s namei/s dirbk/s
01:06:22     270     397     278
01:06:42     602     785     685
01:07:02     194     238     215

Configuring sar to collect data

Notice that you can't just run sar right now. If you try to run the sar command without first configuring it, it gives you an error message like this:
$ sar -a 20 15
sar: can't open /var/adm/sa/sa03
No such file or directory
Sure enough, if you look at the /var/adm/sa directory, you won't see any files in it, much less that sa03 file it's complaining about. If you create a blank file, using touch, for example, sar will start to work. However, why must you do something so strange to make sar work? And if you try to run sar tomorrow, you'll get a similar error, but this time it will complain about a different file, such as sa04.

It turns out that the sar program is only one part of the performance monitoring package. Three commands in the /usr/lib/sa directory also contribute to the whole. The sadc command collects system data and stores it to a binary file, suitable for sar to use. The shell script sa1 is a wrapper for sadc, suitable for use in cron jobs, so it can be run automatically. The sa2 script is a wrapper for sar that forces it to print a report in ASCII format from the binary information in the files sadc creates.

If you run the sa1 script as intended, it creates a binary file containing all the performance statistics for the day. This file allows sar to read the data and report on it without forcing you to wait and collect it. Since you may want to investigate the data a bit later, or compare one day's worth of information against another, the sar, sa1, and sa2 programs all name the data file using the same format: /var/adm/sa/saX, where X is the two-digit day of the month. Therefore, when you run sar, one of the first things it does is look for today's binary file. When it doesn't find the file, it prints the error.
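
To make the naming convention concrete, the following sketch builds the path for a given day. The helper name sa_file is hypothetical (it is not part of the sar package), and the code assumes a POSIX-style shell:

```shell
# Build the data-file path for a given two-digit day of the month.
# sa_file is an illustrative helper, not a command shipped with sar.
sa_file() {
    printf '/var/adm/sa/sa%s\n' "$1"
}

today=$(date +%d)        # two-digit day of the month, e.g. 03
sa_file "$today"         # the file sar looks for by default
sa_file 03               # the file holding data for the 3rd
```

Because sar accepts a -f option naming the file to read, a call such as sar -u -f "$(sa_file 03)" should report on the 3rd instead of today, provided that file exists.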

The best way to run sa1 and sa2 is from a cron job. Sun provides an example of how to create the cron job instead of forcing you to figure it out for yourself. Thus, if you edit the crontab for the account sys, you'll see commented-out sample cron schedules for sa1 and sa2, as shown in Figure A.

Figure A: The sys account already has prototype entries for running sa1 and sa2, which you can uncomment and use.

#ident  "@(#)sys        1.5     92/07/14 SMI"   /* SVr4.0 1.2   */
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
#0 * * * 0-6 /usr/lib/sa/sa1
#20,40 8-17 * * 1-5 /usr/lib/sa/sa1
#5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
The first cron schedule uses sa1 to take a snapshot of system performance at the beginning of every hour, every day. The second cron schedule adds snapshots at 20 minutes (:20) and 40 minutes (:40) past the hour, from 8:20 A.M. through 5:40 P.M., every Monday through Friday. As a result, you get more detail during business hours, and less during the evenings and weekends.

The final line schedules sa2 to run at 6:05 P.M. every Monday through Friday to create an ASCII report from the data collected by sa1. This ASCII data is stored using a similar filename convention: /var/adm/sa/sarX, where X is again the two-digit day of the month.
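
As a rough check on how much data the two sample sa1 entries produce, this sketch counts the snapshots taken on one weekday. It hard-codes the schedule from Figure A rather than parsing the crontab, and assumes a POSIX-style shell:

```shell
# Count sa1 snapshots on a weekday under the two sample schedules:
#   0     *    * * 0-6   -> minute 0 of every hour
#   20,40 8-17 * * 1-5   -> minutes 20 and 40 of hours 8 through 17
count=0
hour=0
while [ "$hour" -le 23 ]; do
    count=$((count + 1))              # the hourly snapshot
    if [ "$hour" -ge 8 ] && [ "$hour" -le 17 ]; then
        count=$((count + 2))          # the two extra business-hour snapshots
    fi
    hour=$((hour + 1))
done
echo "weekday snapshots: $count"
```

On weekends only the first entry applies, so the count drops to one snapshot per hour.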

The simplest way to configure sar to run is to edit the sys account's crontab and remove the # signs from the start of the sa1 and sa2 command lines. However, you may want to customize the cron schedules to suit your own preferences. For example, your company might run multiple shifts, and you may want more detailed data. Thus, you can modify the cron job to run sa1 at 15-minute intervals, every business day.

You can't just log in to the sys account and edit the cron job, though, because the sys account is usually locked. Instead, you must become root, then su to the sys account, like so:

$ su
Password:
# su sys
#
At this point, be sure to set the EDITOR environment variable to your favorite editor, and edit the crontab file, like this:
# EDITOR=vi
# export EDITOR
# crontab -e
Now, your favorite editor (vi, in this case) comes up, and you can edit the cron schedules. For our example, we just want to run sa1 every 15 minutes every day, and the sa2 program should generate ASCII versions of the data just before midnight. So we'll change the cron schedule to look like this:
0,15,30,45 * * * 0-6 /usr/lib/sa/sa1
55 23 * * 0-6 /usr/lib/sa/sa2 -A
Next, we save the file and exit, and crontab installs the new schedule so that cron runs the jobs for us. That's all you must do to configure sar. Once sa1 has run for the first time, you can use sar without worrying about the file-open errors anymore.
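
If you'd like to confirm that collection has started, you can test for today's data file directly. This is a minimal sketch, assuming a POSIX-style shell. The function name have_sa_data and its optional directory argument are our own inventions for illustration:

```shell
# Succeeds if today's binary data file exists in the given directory.
# The directory argument is an illustrative convenience; it defaults
# to /var/adm/sa, the location named in this article.
have_sa_data() {
    dir=${1:-/var/adm/sa}
    [ -f "$dir/sa$(date +%d)" ]
}

if have_sa_data; then
    echo "today's data file is present"
else
    echo "no data file yet; check the sys crontab"
fi
```

Keep in mind that the file appears only after cron has run sa1 at least once, so with a 15-minute schedule you may need to wait a few minutes after saving the crontab.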

Using the binary data files

Once the system is creating the binary data files, you can use sar without specifying the interval between samples and the number of samples you want to take. You can simply specify the data sets you want to see, and sar will print all that's accumulated thus far for the day. Therefore, if you're interested in CPU use and paging activity, you'd run sar as shown in Figure B. Since we ran sar near the end of the day, and we're sampling every 15 minutes, we're inundated with details. That's the major problem with detail--it's easy to get swamped.

Figure B: The sar -up command reports detailed information about the CPU and paging use up to the current time.

$ sar -up

SunOS Devo 5.5.1 Generic_103641-08 i86pc 11/04/97

00:00:01    %usr    %sys    %wio   %idle
00:15:00       0       0       0      99
00:30:00       0       0       0      99
00:45:00       0       1       0      99
22:15:00       0       0       0      99
22:30:00       0       0       0      99
22:45:00       1       1       3      95
Average        3       1       4      92

00:00:01  atch/s  pgin/s ppgin/s  pflt/s  vflt/s slock/s
00:15:00    0.00    0.02    0.03    1.82    2.93    0.00
00:30:00    0.00    0.00    0.00    4.35    6.15    0.00
00:45:00    0.00    0.02    0.02   38.95   44.79    0.00

Getting the bigger picture

While getting a detailed picture of your system is wonderful, you probably don't need or want such a detailed report very often. After all, your job is to manage the system, not micromanage it. Do you think the president of your company monitors the details of the day-to-day operations of the company? Of course not--the president is happy to see the weekly reports showing that the business is chugging along smoothly. It's only when the business is having problems that the president starts to examine and analyze details. Your role as system administrator is similar to that of the company president: As long as the system is running smoothly, you merely want to glance at a report to see that everything is going nicely. You don't want to delve into a morass of details unless something's awry. Consequently, what we usually want from sar isn't a detailed report on all the system statistics, but rather a simple summary.

The sar command provides three command-line switches to let you control how you want sar to summarize its data. The -s and -e options allow you to select the starting and ending times of the report, and the -i option allows you to specify the reporting interval. So you can see an hourly summary of CPU usage during working hours by using sar like this:

$ sar -s 08 -e 18 -i 3600 -u

SunOS Devo 5.5.1 Generic_103641-08 i86pc    11/03/97

08:00:00    %usr    %sys    %wio   %idle
09:00:01       0       1       2      97
10:00:00       3       3       1      94
11:00:00       0       0       0     100
12:00:00       0       0       0     100
13:00:00       0       0       0     100
14:00:00       0       0       0     100
15:00:00       5      56      30       8
16:00:01       3      68      24       5
17:00:00       0      11      10      79
18:00:00       0       0       0     100

Average        1      14       7      78
If we had a performance problem during the day, we could quickly tell when it occurred using this summary report. Then, we'd adjust our -s, -e, and -i options to focus on the details we're actually interested in seeing. Instead of wading through pages of data, we can be selective.
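
Spotting the busy hours by eye works for a ten-line report, but a small filter can do the same job on longer ones. The sketch below assumes the five-column sar -u layout shown above (time, %usr, %sys, %wio, %idle). The function name busy_intervals and the 50 percent threshold are illustrative choices, not part of sar:

```shell
# Print the time and %idle of every interval whose %idle is below a limit.
# $1 = idle threshold in percent; sar -u style text is read from stdin.
# The header row and the Average row are skipped because their first or
# fifth field is not a time stamp or a plain number, respectively.
busy_intervals() {
    awk -v limit="$1" '
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ &&
        $5 ~ /^[0-9]+$/ &&
        $5 + 0 < limit { print $1, $5 }'
}

# Sample input copied from the hourly report above.
busy_intervals 50 <<'EOF'
08:00:00    %usr    %sys    %wio   %idle
09:00:01       0       1       2      97
10:00:00       3       3       1      94
14:00:00       0       0       0     100
15:00:00       5      56      30       8
16:00:01       3      68      24       5
17:00:00       0      11      10      79

Average        1      14       7      78
EOF
```

With the sample data, the filter prints the 15:00:00 and 16:00:01 rows, which are the two hours where the system was mostly busy. On a live system you would pipe the report in instead, for example sar -s 08 -e 18 -i 3600 -u | busy_intervals 50, and then rerun sar with a narrower -s and -e range and a smaller -i value around the times it reports.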

Conclusion

Once you get sar configured, it can capture all the performance statistics for your machine. It's a good idea to browse through the man page for sar a few times to get acquainted with the values it can capture. You don't have to understand all of it, especially at the beginning. To start with, it's a good policy to become familiar with the numbers when your system is operating normally, because then you'll be able to pinpoint which system characteristics are degrading, and begin addressing the problems.